TedTest

From the TED Talk by Joseph Redmon: How computers learn to recognize objects instantly

Unscramble the Blue Letters

So in just a few yaers, we've gone from 20 seconds per image to 20 milliseconds per image, a tsunaohd times feastr. How did we get there? Well, in the past, object dittcoeen systems would take an image like this and slpit it into a bunch of regions and then run a classifier on each of these regions, and high scores for that classifier would be considered detections in the image. But this involved running a cialsifesr thousands of times over an image, thousands of neural network evaluations to produce detection. Instead, we trained a single network to do all of detection for us. It produces all of the bounding boxes and class probabilities simultaneously. With our sstyem, instead of looking at an igmae thousands of times to produce detection, you only look once, and that's why we call it the YOLO metohd of object detection. So with this speed, we're not just limited to images; we can process video in real time. And now, instead of just seeing that cat and dog, we can see them move around and iercantt with each other.

Open Cloze

So in just a few _____, we've gone from 20 seconds per image to 20 milliseconds per image, a ________ times ______. How did we get there? Well, in the past, object _________ systems would take an image like this and _____ it into a bunch of regions and then run a classifier on each of these regions, and high scores for that classifier would be considered detections in the image. But this involved running a __________ thousands of times over an image, thousands of neural network evaluations to produce detection. Instead, we trained a single network to do all of detection for us. It produces all of the bounding boxes and class probabilities simultaneously. With our ______, instead of looking at an _____ thousands of times to produce detection, you only look once, and that's why we call it the YOLO ______ of object detection. So with this speed, we're not just limited to images; we can process video in real time. And now, instead of just seeing that cat and dog, we can see them move around and ________ with each other.

Solution

faster
detection
method
thousand
years
interact
system
image
classifier
split

Original Text

So in just a few years, we've gone from 20 seconds per image to 20 milliseconds per image, a thousand times faster. How did we get there? Well, in the past, object detection systems would take an image like this and split it into a bunch of regions and then run a classifier on each of these regions, and high scores for that classifier would be considered detections in the image. But this involved running a classifier thousands of times over an image, thousands of neural network evaluations to produce detection. Instead, we trained a single network to do all of detection for us. It produces all of the bounding boxes and class probabilities simultaneously. With our system, instead of looking at an image thousands of times to produce detection, you only look once, and that's why we call it the YOLO method of object detection. So with this speed, we're not just limited to images; we can process video in real time. And now, instead of just seeing that cat and dog, we can see them move around and interact with each other.
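The contrast described in the passage can be sketched in a few lines of Python. This is a toy illustration only, not the speaker's implementation: the region size, stride, threshold, and every function name here (`split_into_regions`, `detect_by_regions`, `speedup`) are assumptions made up for the example, and the classifier is a stand-in function rather than a neural network. It shows that the older approach costs one classifier call per region, and checks the "thousand times faster" arithmetic.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height), hypothetical layout


def split_into_regions(width: int, height: int, size: int, stride: int) -> List[Box]:
    """Cut an image of the given dimensions into square regions (assumed grid scheme)."""
    return [
        (x, y, size, size)
        for y in range(0, height - size + 1, stride)
        for x in range(0, width - size + 1, stride)
    ]


def detect_by_regions(
    regions: List[Box],
    classify: Callable[[Box], float],
    threshold: float,
) -> List[Tuple[Box, float]]:
    """Older approach from the passage: run the classifier once per region
    and keep the regions whose score is high enough."""
    detections = []
    for box in regions:
        score = classify(box)  # one classifier evaluation per region
        if score >= threshold:
            detections.append((box, score))
    return detections


def speedup(old_ms: float, new_ms: float) -> float:
    """How many times faster the new timing is than the old one."""
    return old_ms / new_ms


if __name__ == "__main__":
    regions = split_into_regions(width=64, height=64, size=16, stride=16)
    calls = []

    def toy_classifier(box: Box) -> float:
        # Stand-in scorer: pretends an object sits in exactly one region.
        calls.append(box)
        return 0.9 if box == (16, 16, 16, 16) else 0.1

    found = detect_by_regions(regions, toy_classifier, threshold=0.5)
    print(len(calls), "classifier calls for", len(regions), "regions")
    print("detections:", found)
    # 20 seconds = 20,000 ms versus 20 ms per image
    print("speedup:", speedup(20_000, 20))
```

In the single-network approach the passage describes, the per-region loop is replaced by one evaluation that returns all boxes and class probabilities together, which is why the number of network calls per image drops from thousands to one.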

Frequently Occurring Word Combinations

ngrams of length 2

collocation	frequency
computer vision	5
object detection	4
real time	3
neural network	2
bounding boxes	2
times faster	2
detection system	2
stop signs	2

Important Words

bounding
boxes
bunch
call
cat
class
classifier
considered
detection
detections
dog
evaluations
faster
high
image
interact
involved
limited
method
milliseconds
move
network
neural
object
probabilities
process
produce
produces
real
regions
run
running
scores
seconds
simultaneously
single
speed
split
system
systems
thousand
thousands
time
times
trained
video
years
yolo